Structured Index System at NTCIR Workshop 2: Information Retrieval Methods Using Ordered Co-occurrence of Words and their Dependency Relationships
Abstract
We propose two Japanese-language information retrieval methods that enhance retrieval effectiveness by using relationships between words. The first method uses dependency relationships between words in a sentence. The second uses proximity relationships, specifically the ordered co-occurrence of words in a sentence, as an approximation of the dependency relationships between them. We build both methods on the Structured Index, which represents dependency relationships between words in a sentence as a set of binary trees. The Structured Index is created through morphological analysis, dependency analysis, and compound noun analysis. We report the results of retrieval experiments using NTCIR-2 and discuss the effect of using relationships between words on Japanese information retrieval.
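The idea of ordered co-occurrence can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `ordered_pairs` and `pair_overlap`, the window size of 3, and the overlap score are all assumptions made for illustration, and the input is assumed to be already segmented into words (for Japanese, by a morphological analyzer).

```python
from collections import Counter
from typing import List


def ordered_pairs(tokens: List[str], window: int = 3) -> Counter:
    """Count ordered pairs (left, right) in which `right` follows `left`
    within `window` tokens of the same sentence.

    The window size is a hypothetical parameter, not a value from the paper.
    """
    pairs: Counter = Counter()
    for i, left in enumerate(tokens):
        for right in tokens[i + 1 : i + 1 + window]:
            pairs[(left, right)] += 1
    return pairs


def pair_overlap(query: List[str], sentence: List[str], window: int = 3) -> int:
    """Illustrative score: number of ordered pairs shared by query and sentence."""
    q = ordered_pairs(query, window)
    s = ordered_pairs(sentence, window)
    # Counter returns 0 for missing keys, so unmatched pairs contribute nothing.
    return sum(min(count, s[pair]) for pair, count in q.items())


# Pre-segmented example sentence (segmentation is assumed, not computed here).
sentence = ["日本語", "情報", "検索", "手法"]
print(pair_overlap(["情報", "検索"], sentence))  # same word order as the sentence
print(pair_overlap(["検索", "情報"], sentence))  # reversed word order
```

Because the pairs are ordered, a query whose words appear in the same order as in the sentence matches, while the reversed query does not. This is the sense in which word order within a sentence can stand in for a dependency relation without running a dependency parser.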
Similar resources
Structured Index System at NTCIR1: Information Retrieval using Dependency Relationship between Words
It is difficult to improve retrieval effectiveness using only keyword-based retrieval, the major method in document retrieval, due to its high dependence on statistical word distribution. We therefore propose a method to enhance retrieval effectiveness using dependency relationships between words in a sentence. In our method, we create a Structured Index, represented by a binary tree through de...
The Effect of Information Retrieval Method Using Dependency Relationship Between Words
It is difficult to improve retrieval effectiveness by using only keyword-based retrieval due to its high dependence on statistical word distributions. We propose a method that enhances retrieval effectiveness using dependency relationships between words in sentences of documents and queries. In our method, we create a Structured Index, represented by a binary tree, in three steps. These steps a...
NTCIR-2 ECIR Experiments at Maryland: Comparing Pirkola's Structured Queries and Balanced Translation
Pirkola’s word-based structured queries have been shown to perform well for word-based cross-language information retrieval in European languages. Monolingual Chinese retrieval experiments, by contrast, often find that character bigrams perform as well as (and sometimes better than) automatically segmented words. During the Mandarin-English Information (MEI) project at the Johns Hopkins Summer 2...
IASL System for NTCIR-6 Korean-Chinese Cross-Language Information Retrieval
This paper describes our Korean-Chinese cross-language information retrieval system for NTCIR-6. Our system uses a bilingual dictionary to perform query translation. We expand our bilingual dictionary by extracting words and their translations from the Wikipedia site, an online encyclopedia. To resolve the problem of translating Western people's names into Chinese, we propose a transliteration ...
NTCIR-3 CLIR Experiments at MSRA
This paper describes three statistical models for resolving query translation ambiguity in cross-language information retrieval (CLIR). First, a decaying co-occurrence model is presented. It extends traditional co-occurrence models with a decaying factor that decreases the mutual information as the distance between the terms increases. Second, a phrase...